-
Notifications
You must be signed in to change notification settings - Fork 2
feat: Implement interface monitoring setup and cleanup #49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Added tasks for setting up an interface monitoring system, including creating necessary directories, scripts, and systemd service/timer files. - Implemented cleanup tasks to stop and remove the interface monitor service, timer, and associated files. - Enhanced the main task file to include the new interface monitoring setup.
- Updated the PortViewSet to include detailed logging for port synchronization processes, improving traceability and debugging. - Modified the get_port_status_from_results function to enhance port status checks with additional logging and conditions for determining port existence and state. - Updated the .env.example file to specify the Docker image for the backend. - Removed a console log statement in the AddBridgeDialog component for cleaner production code.
WalkthroughAdds an interface-monitoring feature (script, systemd units, timer, logrotate), corresponding cleanup tasks, extensive debug logging in port sync/status parsing, and updates default Docker image tag in .env and docker-compose for celery_beat. Changes
Sequence Diagram(s)sequenceDiagram
participant Timer as systemd Timer
participant Service as interface-monitor.service
participant Script as monitor_interfaces.sh
participant Log as /var/log/interface_monitor.log
Timer->>Service: Trigger (OnBootSec=30s, OnUnitActiveSec=30s)
Service->>Script: ExecStart (oneshot)
Script->>Script: Acquire lock file
Note right of Script: Iterate INTERFACES array\nCheck each with `ip link`
alt Interface DOWN
Script->>Script: Attempt ip link set dev <if> up
Script->>Log: Log action/result with timestamp
else Interface UP
Script->>Log: Log already UP
end
Script->>Log: Emit summary (# brought up)
Script->>Script: Release lock, handle signals
Script-->>Service: Exit (oneshot)
sequenceDiagram
participant Parser as get_port_status_from_results
participant Output as ip link output (raw)
participant Logic as Parsing logic
participant Result as Returned status
Parser->>Output: Receive raw command output
Parser->>Logic: Check NOT_FOUND indicators (various strings, empty)
alt NOT_FOUND detected
Logic->>Result: Return False
else Output present
Logic->>Logic: Extract state lines and flags (e.g., <NO-CARRIER>, <LOWER_UP>)
alt State == UP and NO-CARRIER present
Logic->>Result: Return False
else State == UP and no NO-CARRIER
Logic->>Result: Return True
else State == DOWN
Logic->>Result: Return False
else No explicit state
Logic->>Logic: Secondary pass on flags (NO-CARRIER → False, LOWER_UP → True)
alt Determined
Logic->>Result: Return True/False
else
Logic->>Result: Return None
end
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (7)
ui/ui/src/components/devices/AddBridgeDialog.tsx (1)
167-173: Clean up commented-out debug logThe commented
console.logis no longer used; consider removing it (or switching to a structured logging mechanism) to avoid accumulating dead debug code in the component.backend/control_center/.env.example (1)
49-53: DOCKER_IMAGE default looks good; add trailing newline for linterSetting
DOCKER_IMAGE=taurinetech/sdn-launch-control-backend:latestaligns the example with the published image; that part looks fine.
To satisfydotenv-linter’sEndingBlankLinewarning, add a blank newline at the end of the file.backend/control_center/ansible/roles/ovs-cleanup-port-setup/tasks/cleanup_port_setup.yml (1)
19-69: Cleanup flow is consistent; consider optional hardeningThe interface-monitor cleanup sequence (stop/disable timer → remove unit files/logrotate → remove directory and logs →
daemon_reload) is coherent and mirrors the setup role nicely. Two optional tweaks you might consider:
- Also stop the
interface-monitor.serviceexplicitly, in case a run is in progress during cleanup.- Narrow
ignore_errorsif you ever want to surface genuine failures (e.g., permission issues) rather than swallowing all errors.These are not blockers given this is best-effort cleanup logic.
backend/control_center/utils/ansible_formtter.py (1)
230-347: Port-status parsing is much more robust; address small lint and clarify semanticsThe expanded logging and flag/state parsing in
get_port_status_from_resultslooks solid and should make diagnosing weirdip linkoutputs much easier. A few small follow-ups:
Ruff F541 (f-string without placeholders)
At line 313 you have an f-string with no interpolation:logger.debug(f"[PORT_STATUS] No explicit state found in output, checking flags in angle brackets")This can just be a plain string:
- logger.debug(f"[PORT_STATUS] No explicit state found in output, checking flags in angle brackets") + logger.debug("[PORT_STATUS] No explicit state found in output, checking flags in angle brackets")Non-existent ports now return
Falseinstead ofNone
For thenot_found_conditionsbranch you now returnFalse, meaning “down”. That’s reasonable, but it is a behavior change from “unknown” (None). It’d be good to confirm all call sites (e.g.,PortViewSet.sync) indeed want missing interfaces to be treated as down, and perhaps mention this explicitly in the docstring (e.g., “False if down or interface not found”).Assumption about
ip linkoutput scope (optional)
The parsing walks alloutput_lineswithout checking that a line belongs toport_name. This is fine as long as the underlying Ansible task runsip link show dev {{ port_name }}(or similar) so only one interface is present. If that task is ever broadened to show multiple interfaces, you’ll want to filter lines by the interface name before interpreting state/flags.backend/control_center/general/views.py (1)
691-763: Port.sync logging is much clearer; dedupe one error log lineThe additional
[PORT_SYNC]debug logs around playbook execution, parsed results, and DB updates significantly improve traceability of port sync behavior.One small cleanup: on failure you log the same error twice—
logger.error(f"[PORT_SYNC] Failed to sync port {port.name}: {error_msg}") logger.error(f"Failed to sync port {port.name}: {error_msg}")You can safely drop the second line (or change its level if you want both a structured and a human-friendly variant) to avoid duplicate error entries. Optionally, you might also standardize on the
[PORT_SYNC]prefix for all related log messages.backend/control_center/ansible/roles/ovs-port-setup/tasks/setup_interface_monitoring.yml (2)
71-82: Add validation for PID read from lock file.Line 73 reads the PID without validating its format. If the lock file contains invalid data, line 74 (
kill -0 "$pid") may behave unexpectedly. Add a quick sanity check.# Function to check if script is already running check_lock() { if [ -f "$LOCK_FILE" ]; then - local pid=$(cat "$LOCK_FILE") + local pid=$(cat "$LOCK_FILE" 2>/dev/null) + [[ $pid =~ ^[0-9]+$ ]] || { log_message "ERROR: Invalid PID in lock file"; return 1; } if kill -0 "$pid" 2>/dev/null; then log_message "WARNING: Monitor script already running (PID: $pid)" exit 1 else # Lock file exists but process is dead, remove it rm -f "$LOCK_FILE" fi fi }
36-44: Minor: Consider more robust interface status parsing.Line 39 uses
grep -q "state UP"to detect active interfaces. While this works for typical interface names, it could theoretically match interface names containing that string. For better reliability, consider using word boundaries or awk-based parsing.# Function to check if interface is up is_interface_up() { local interface="$1" # Check if interface exists and is UP - if ip link show "$interface" 2>/dev/null | grep -q "state UP"; then + if ip link show "$interface" 2>/dev/null | grep -qE '^\s+state UP'; then return 0 # Interface is up else return 1 # Interface is down fi }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
backend/control_center/.env.example(1 hunks)backend/control_center/ansible/roles/ovs-cleanup-port-setup/tasks/cleanup_port_setup.yml(1 hunks)backend/control_center/ansible/roles/ovs-port-setup/tasks/main.yml(1 hunks)backend/control_center/ansible/roles/ovs-port-setup/tasks/setup_interface_monitoring.yml(1 hunks)backend/control_center/general/views.py(3 hunks)backend/control_center/utils/ansible_formtter.py(2 hunks)ui/ui/src/components/devices/AddBridgeDialog.tsx(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/control_center/general/views.py (2)
backend/control_center/utils/ansible_formtter.py (2)
get_single_port_speed_from_results(187-215)get_port_status_from_results(218-348)backend/control_center/general/serializers.py (1)
save(55-72)
🪛 dotenv-linter (4.0.0)
backend/control_center/.env.example
[warning] 53-53: [EndingBlankLine] No blank line at the end of the file
(EndingBlankLine)
🪛 Ruff (0.14.4)
backend/control_center/utils/ansible_formtter.py
313-313: f-string without any placeholders
Remove extraneous f prefix
(F541)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Docker Build Validation
🔇 Additional comments (7)
backend/control_center/ansible/roles/ovs-port-setup/tasks/main.yml (1)
8-12: Clean task integration.The new "Setup interface monitoring" task follows established patterns and integrates logically after port setup. The import structure is correct.
backend/control_center/ansible/roles/ovs-port-setup/tasks/setup_interface_monitoring.yml (6)
2-6: LGTM: Directory creation is straightforward.Mode 0755 is appropriate for the monitor directory.
8-14: LGTM: Log file creation with correct permissions.File is set to 0644 with root ownership, appropriate for a log file that will be rotated by logrotate.
134-152: LGTM: Systemd service configuration is sound.Type=oneshot is correct for a script that runs once per invocation. Running as root is necessary for ip link commands. Journald logging is appropriate and integrates well with system monitoring.
154-169: LGTM: Systemd timer is well-configured.The 30-second boot delay and 30-second repeat interval are reasonable for interface monitoring. The
Persistent=truesetting ensures missed timer runs are executed on system resume, which is valuable for embedded/laptop scenarios. TheRequiresdependency correctly links to the service.
171-185: LGTM: Logrotate configuration is well-balanced.The daily rotation with 7-day retention, compression, and 3MB size limit provides reasonable log management for a periodic monitoring script. The
delaycompressandnotifemptysettings follow best practices.
187-195: LGTM: Systemd daemon reload and timer activation are correctly sequenced.The daemon_reload is properly placed before enabling/starting the timer. The timer will persist across reboots and start automatically on boot with the 30-second delay.
| - name: Create interface monitoring script | ||
| copy: | ||
| dest: "/mnt/interface-monitor/monitor_interfaces.sh" | ||
| content: | | ||
| #!/bin/bash | ||
| # Network Interface Monitor Script | ||
| # Monitors USB-to-Ethernet adapters and brings them up if they're down | ||
| # Interfaces: {{ interfaces | join(', ') }} | ||
|
|
||
| # Configuration | ||
| INTERFACES=({{ interfaces | map('quote') | join(' ') }}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify the interfaces Jinja variable is available in playbook context.
Line 26 uses {{ interfaces | map('quote') | join(' ') }} to populate the INTERFACES bash array. This variable must be provided by the calling playbook or role defaults. If not defined, the array will be malformed.
Confirm that:
- The
interfacesvariable is passed to this task (e.g., as a role variable or task var) - The variable is properly documented in role defaults or README
- The
map('quote')filter correctly escapes interface names with special characters
To verify this, check if the interfaces variable is defined by searching the playbook context:
🏁 Script executed:
#!/bin/bash
# Search for where `interfaces` variable is defined or passed in Ansible files
# Look for variable definitions in defaults, vars, or task invocations
find backend/control_center/ansible -name "*.yml" -o -name "*.yaml" | xargs rg -n "interfaces:" -A 2 -B 2Length of output: 0
🏁 Script executed:
#!/bin/bash
# 1. Verify the monitoring task file exists
echo "=== Checking if monitoring task file exists ==="
ls -la backend/control_center/ansible/roles/ovs-port-setup/tasks/setup_interface_monitoring.yml 2>&1
# 2. Search for "interfaces" more broadly (without colon requirement)
echo -e "\n=== Searching for 'interfaces' pattern in Ansible files ==="
find backend/control_center/ansible -type f \( -name "*.yml" -o -name "*.yaml" \) -exec rg -l "interfaces" {} \;
# 3. Check role structure for defaults and vars
echo -e "\n=== Role defaults/main.yml content ==="
if [ -f "backend/control_center/ansible/roles/ovs-port-setup/defaults/main.yml" ]; then
cat backend/control_center/ansible/roles/ovs-port-setup/defaults/main.yml
else
echo "File not found"
fi
echo -e "\n=== Role vars/main.yml content ==="
if [ -f "backend/control_center/ansible/roles/ovs-port-setup/vars/main.yml" ]; then
cat backend/control_center/ansible/roles/ovs-port-setup/vars/main.yml
else
echo "File not found"
fi
# 4. Search for playbooks that might call this role
echo -e "\n=== Searching for playbook files that call ovs-port-setup role ==="
find backend/control_center/ansible -type f -name "*.yml" -o -name "*.yaml" | xargs rg -l "ovs-port-setup" | head -10Length of output: 1173
🏁 Script executed:
#!/bin/bash
# 1. Read the main playbook to understand how the role is invoked
echo "=== Main playbook: ovs-port-setup.yml ==="
cat backend/control_center/ansible/playbooks/ovs-port-setup.yml
# 2. Check role's tasks/main.yml if it exists
echo -e "\n=== Role tasks/main.yml ==="
if [ -f "backend/control_center/ansible/roles/ovs-port-setup/tasks/main.yml" ]; then
cat backend/control_center/ansible/roles/ovs-port-setup/tasks/main.yml
else
echo "File not found"
fi
# 3. Check add_ports.yml to see how it references interfaces
echo -e "\n=== Role add_ports.yml (first 50 lines) ==="
head -50 backend/control_center/ansible/roles/ovs-port-setup/tasks/add_ports.yml
# 4. Check the full setup_interface_monitoring.yml to see context of interfaces usage
echo -e "\n=== Full setup_interface_monitoring.yml ==="
cat backend/control_center/ansible/roles/ovs-port-setup/tasks/setup_interface_monitoring.ymlLength of output: 6571
🏁 Script executed:
#!/bin/bash
# Check for group_vars and host_vars directories
echo "=== Searching for group_vars and host_vars ==="
find backend/control_center/ansible -type d -name "group_vars" -o -type d -name "host_vars"
echo -e "\n=== Content of group_vars directories ==="
find backend/control_center/ansible -path "*group_vars*" -type f | head -20
echo -e "\n=== Content of host_vars directories ==="
find backend/control_center/ansible -path "*host_vars*" -type f | head -20
echo -e "\n=== Searching for interfaces variable definition in all yml/yaml files ==="
rg -n "^\s*interfaces\s*:" backend/control_center/ansible --type yaml -C 3 | head -100
echo -e "\n=== Check inventory files ==="
find backend/control_center/ansible -name "inventory*" -o -name "hosts*" | head -10Length of output: 238
The interfaces variable is undefined throughout the codebase and must be defined before this role executes.
Verification found that {{ interfaces }} is used in both setup_interface_monitoring.yml (lines 24, 26) and add_ports.yml, but the variable is never defined in:
- The playbook (
ovs-port-setup.yml) - Role defaults or vars files (do not exist)
- Group vars or host vars (do not exist)
- Any other YAML configuration file
Without this variable, the Jinja template will produce an empty or malformed INTERFACES array, and the monitoring script will be non-functional. The with_items: {{ interfaces }} in add_ports.yml will also fail.
Required action: Define interfaces as a list variable (either in playbook vars, role defaults/main.yml, or pass via command line with -e interfaces=[...]).
🤖 Prompt for AI Agents
In
backend/control_center/ansible/roles/ovs-port-setup/tasks/setup_interface_monitoring.yml
around lines 16 to 26, the Jinja variable {{ interfaces }} is undefined and must
be declared as a list before this role runs; add a list variable named
interfaces (e.g., interfaces: ['ethX', 'ethY']) either in the role's
defaults/main.yml, in the playbook vars of ovs-port-setup.yml, or pass via -e on
the command line so both setup_interface_monitoring.yml and add_ports.yml can
iterate it; ensure it is a YAML list (not a string), contains the correct
interface names, and is available to the role scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/control_center/.env.example(1 hunks)backend/control_center/docker-compose.yml(1 hunks)
🧰 Additional context used
🪛 dotenv-linter (4.0.0)
backend/control_center/.env.example
[warning] 53-53: [EndingBlankLine] No blank line at the end of the file
(EndingBlankLine)
🔇 Additional comments (2)
backend/control_center/docker-compose.yml (1)
77-90: Service image tag inconsistency: celery_beat pinned to 1.0.0-beta while sibling services default to latest.Line 80 updates the celery_beat service to use
sdn-launch-control-backend:1.0.0-betaas the fallback, while the django service (line 33) and celery service (line 65) retainlatestas their fallbacks. This creates an inconsistency where celery_beat and other services may run different image versions whenDOCKER_IMAGEis not explicitly set in .env.Verify this is intentional. If all services should use the same version, consider updating line 65 (celery service) and line 33 (django service) to also reference
1.0.0-beta, or update line 80 back tolatestto maintain consistency.backend/control_center/.env.example (1)
49-53: Verify pre-release Docker image exists and confirm intent to default to beta version.Line 53 sets an active, concrete
DOCKER_IMAGEvalue pointing totaurinetech/sdn-launch-control-backend:1.0.0-beta. This represents a shift from the earlier example (lines 44-47) that showed a placeholder. Two concerns:
Image availability: Confirm that the
taurinetech/sdn-launch-control-backend:1.0.0-betaimage exists and is publicly accessible on Docker Hub. If this is a pre-release or internal build, new deployments using .env.example will fail unless the image is available.Default version policy: Using a pre-release (beta) version as the default in .env.example may not be appropriate for production deployments or new users. Consider whether the stable release version should be the default, with beta as an opt-in option via comments or documentation.
| # Format: username/repository:tag or repository:tag | ||
| # Example: myusername/sdn-launch-control-backend:latest | ||
| # DOCKER_IMAGE=sdn-launch-control-backend:latest No newline at end of file | ||
| DOCKER_IMAGE=taurinetech/sdn-launch-control-backend:1.0.0-beta No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add blank line at end of file.
Missing blank line at end of file. POSIX convention and most linters expect files to end with a newline character.
Apply this diff:
-DOCKER_IMAGE=taurinetech/sdn-launch-control-backend:1.0.0-beta
+DOCKER_IMAGE=taurinetech/sdn-launch-control-backend:1.0.0-beta
+📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| DOCKER_IMAGE=taurinetech/sdn-launch-control-backend:1.0.0-beta | |
| DOCKER_IMAGE=taurinetech/sdn-launch-control-backend:1.0.0-beta | |
🧰 Tools
🪛 dotenv-linter (4.0.0)
[warning] 53-53: [EndingBlankLine] No blank line at the end of the file
(EndingBlankLine)
🤖 Prompt for AI Agents
In backend/control_center/.env.example around line 53, the file is missing a
trailing newline (POSIX convention and linters expect a newline at EOF); fix by
adding a single blank line/newline character at the end of the file so the file
ends with a newline.
Summary by CodeRabbit
New Features
Bug Fixes
Chores